tRNA expression and modification landscapes, and their dynamics during zebrafish embryo development

Abstract tRNA genes exist in multiple copies in the genome of all organisms across the three domains of life. Besides the sequence differences across tRNA copies, extensive post-transcriptional modification adds a further layer to tRNA diversification. Whilst the crucial role of tRNAs as adapter molecules in protein translation is well established, whether all tRNAs are actually expressed, and whether the differences across isodecoders play any regulatory role is only recently being uncovered. Here we built upon recent developments in the use of NGS-based methods for RNA modification detection and developed tRAM-seq, an experimental protocol and in silico analysis pipeline to investigate tRNA expression and modification. Using tRAM-seq, we analysed the full ensemble of nucleo-cytoplasmic and mitochondrial tRNAs during embryonic development of the model vertebrate zebrafish. We show that the repertoire of tRNAs changes during development, with an apparent major switch in tRNA isodecoder expression and modification profile taking place around the start of gastrulation. Taken together, our findings suggest the existence of a general reprogramming of the expressed tRNA pool, possibly gearing the translational machinery for distinct stages of the delicate and crucial process of embryo development.


Introduction
tRNAs are the essential adapter molecules mediating the translation of messenger RNAs in the decoding centre of the ribosome.Besides the crucial role in protein synthesis, new regulatory functions of tRNAs and derived fragments have been uncovered in recent years (reviewed in ( 1 )).Yet relatively little is known about the actual tRNA expression landscape and eventual dynamics in different physiological and / or pathological conditions.tRNA genes are present in multiple copies, with repertoires in higher eukaryotes ranging from a few hundred (about 600 copies in the human genome) up to thousands (e.g. over 20 000 in zebrafish) ( 2 ).These estimates are based on computational predictions, and to date it is not clear whether all putative tRNA genes are actually transcribed, and whether their expression levels are modu-lated.Noteworthy, the expression of ribosomal RNAs has been shown to be heavily regulated during development in vertebrates, with distinct sets of rRNA genes being expressed in the oocyte and early stages of embryonic development versus later embryonic stages and adult zebrafish ( 3 ,4 ).While similar findings were recently reported also for other non-coding RNAs (5)(6)(7), reliable data on tRNAs remain scarce and underrepresented in transcriptomic studies.
The study of tRNA expression is complicated due to technical reasons and the biology of tRNAs themselves.Typically, the genome encodes multiple copies of tRNAs for each anticodon.These multiple copies can be identical in the sequence of the mature tRNA, or they can be isodecoders, i.e. have identical anticodon sequence but differ in the rest of the sequence by a few or a substantial number of nucleotides ( 2 ).
When (small) RNAs are analysed by next-generation sequencing (NGS), reads mapping to multiple sites in the genome, so called multimappers, are usually discarded from further analysis.Furthermore, reads originating from tRNAs that present a few mismatches can also lead to mapping artefacts, or be discarded when the alignment parameters are not suitably adjusted.As a consequence, standard NGS data analysis pipelines cannot be used for analysis of tRNA expression.
One additional challenge in the study of tRNAs is the presence of abundant modifications.To date, about 170 RNA modifications have been described, most of which are found in tRNAs ( 8 ).With up to 20% of nucleotides modified at the sugar or base moiety, tRNAs are the most modified RNA molecules in the cell.Many of the described modifications, particularly those present at the Watson-Crick face of the nucleobase, interfere with reverse transcription and can cause early termination or sequence errors in the cDNA.These 'errors' sum up in the diversity of the reads originating from the tRNA transcriptome, requiring an optimized analysis approach.In recent years, new NGS strategies actually harnessed those sequencing errors for detection of RNA modifications (reviewed in ( 9 )).Furthermore, the availability of enzymes able to remove RNA methylations (which are the most abundant tRNA modifications ( 10 )), new reverse transcriptases, and improved mapping strategies have enhanced our ability to investigate tRNA expression and modification (11)(12)(13)(14).Still, to date only few studies have addressed the possible modulation of tRNA expression and modification, for example in different tissues ( 15 ) or in response to stress and nutrient availability (see review ( 16 )).Remarkably, mutations in the majority of tRNA modification enzymes are associated to diseases, typically characterized by developmental disorders and phenotypes restricted to specific tissues ( 17 ).These observations suggest that the hypomodification of tRNAs or the consequent depletion of mature tRNAs in their correct, functional form may be particularly detrimental in specific developmental stages and / or cellular contexts.
Here we analysed the dynamics of tRNA expression and modifications during embryo development in vertebrates, using zebrafish ( Danio rerio ) as model organism due to its fast and well characterized embryonic development ( 18 ).We optimized previously published small RNA sequencing protocols and devised an improved in silico analysis pipeline to delineate the expression profile of nuclear-encoded as well as mitochondrial tRNAs during the early stages of zebrafish embryo development.Furthermore, we have profiled a wide array of different tRNA modifications.We show that tRNA expression levels vary during embryo development, and specific tRNA modifications are dynamic in a tRNA-and developmental stage-specific manner.Taken together, our results suggest the existence of a dynamic reprogramming of tRNA expression and modification during embryogenesis, possibly contributing to the fine-tuning of the protein synthesis machinery during development.

Fish husbandry
Zebrafish ( Danio rerio ) were raised according to standard protocols (28 • C water temperature; 14 / 10-h light / dark cycle).TLAB fish, generated by crossing zebrafish AB and the natural variant TL (Tupfel Longfin), were used for all experiments.All fish experiments were conducted according to Austrian and European guidelines for animal research and approved by the Amt der Wiener Landesregierung, Magistratsabteilung 58-Wasserrecht (animal protocols GZ 342445 / 2016 / 12 and MA 58-221180-2021-16).

RNA and tRNA fraction isolation
Samples from individual stages were collected as follows: between 100 and 200 chorionated embryos were collected per time point at the indicated stages.Embryos were transferred into 1.5 ml tubes and as much water as possible was removed before embryos were homogenized in 500 μl of TRIzol (Invitrogen).To obtain eggs, females were anesthetized using 0.01% Tricaine and eggs were isolated via a standard protocol in zebrafish (squeezing) ( 19 ).Between 100 and 200 eggs were activated by adding fish water (5 mM NaCl, 0.17 mM KCl, 0.33 mM CaCl 2 , 0.33 mM MgSO 4 , 0.0001% methylene blue) and incubated for 10 minutes before collection and homogenization in TRIzol.To obtain ovaries, females (one for each replicate) were dissected.Ovaries were harvested and homogenized in 500 μl TRIzol.
Total RNA was extracted using the standard TRIzol protocol.Total RNA was separated by denaturing polyacrylamide gel electrophoresis (PAGE), and the small RNA fraction ( ∼60-120 nt) containing tRNAs was eluted in elution buffer (1 mM ED TA, 0.5 M NH 4 O Ac and 0.1% SDS), phenol-chloroform extracted and precipitated.
Bisulfite conversion: 500 ng of demethylated tRNA were treated with the EZ RNA Methylation Kit (Zymo Research) according to manufacturer's instructions.
Deacylation and dephosphorylation: tRNA samples were deacylated by incubation at 37  RNase Inhibitor (NEB), 5 mM DTT and pre-incubated for 10 min at 42 • C.Then, dNTPs were added to a final concentration of 1.25 mM, followed by incubation at 42 • C for 16 hours.The remaining RNA in the RT sample was hydrolysed by adding 1 μl 5 M NaOH and incubation of 95 • C for 3 min.The cDNA was loaded on a 10% denaturing polyacrylamide gel and gel pieces were excised above the unextended RT primer till 400 nt, crushed, and the cDNA was eluted two times for 1 h at 70 • C, 1500 rpm in 325 μl 1 × TE pH 8.The two elutions were pooled and isopropanol precipitated.
cDNA circularization and amplification: the recovered cDNA was circularized by incubation in a reaction volume of 10 μl in the presence of 50 U CircLigase™ ssDNA Ligase (Epicentre), 1 M betaine, 1 × CircLigase buffer, 50 μM ATP and 2.5 mM MnCl 2 for 3 hours at 60 • C, followed by 10 min at 80 • C to deactivate the enzyme.Amplification of 4 μl circularized cDNA was performed in a reaction volume of 48 μl in the presence of 500 nM PCR forward primer (AA TGA T ACG-GCGA CCA CCGA GA TCT ACA*C, where * is a phosphorothioate bond) and indexed NEBNext® Multiplex Oligos for Illumina (NEB), 0.48 U KAPA HiFi Polymerase (KAPA Biosystems), 1 × KAPA HiFi GC Buffer and 30 μM dNTP Mix with an initial denaturation at 95 • C for 3 min, followed by 13-15 cycles of 98 • C for 20 s, 62 • C for 30 s and 72 • C for 30 s. Libraries were purified with DNA Clean&Concentrator-5 kit (Zymo) or GeneJET Gel Extraction and DNA Cleanup Micro Kit (Thermo Fisher Scientific), quantified using Qubit dsDNA HS Assay (Invitrogen) and sequenced on a NextSeq500 instrument (Illumina) at the Core Facilities of the Medical University of Vienna, a member of VLSI.

tRNA NGS data analysis
Details on the NGS data analysis can be found in Supplementary File 1 as Supplementary Text , Supplementary Table S1 , Supplementary Figure S2 and Supplementary Figure S3 .A comparison between tRAM-seq and mim-tRNAseq is provided in Supplementary Figure S4 .The results of the whole analysis are available in Supplementary Data 1.Here in the following is a short summary of the strategy and steps performed.
Preprocessing: The NGS data sets were preprocessed by trimming adapters, extraction of UMIs, trimming of untemplated nucleotides at the 5 end, size filtering and quality filtering.
Reference genome: The initial tRNA reference genome was created by collecting mitochondrial tRNA sequences from mitotRNAdb and genomic (nuclear) sequences from a tRNAscan-SE run provided in GtRNAdb, including low confidence and pseudogene predictions.All DM samples were mapped to this initial reference genome (segemehl) and any genes with a coverage of at least 500 reads per million (RPM) in at least one sample were included in the reference genome.The reference genome was further refined manually.Canonical positions were manually assigned to the RFAM tRNA alignment (RF0005) and then transferred to the reference genome by aligning the tRNA references to the RFAM alignment.
Mapping: Mapping was performed with segemehl with optimized parameters for short reads.Bisulfite treated samples were mapped with the bisulfite version of segemehl (-F1) and RNA-specific post-processing was applied.
Clustering: References were clustered based on (i) sequence similarity and (ii) by multimapper counts obtained from mapping.In the first step any references that differ in at most 3 positions were merged.In a second step, any clusters that consisted of more than 50% reads that can also be found in another larger cluster were merged into this larger cluster.The mapping results were merged for all references within one cluster.
Abundance: The abundance of each cluster was computed based on the DM samples by mapping on the manually refined reference set, random assignment of multi-mapping reads to one of the best matching references, counting of the mapped reads per reference, summing up the abundance of all references in a given cluster and normalizing the abundance count by the total count of mapped reads.
Misincorporation rates: Possible modification sites and possible changes of modification level were identified by computing position-wise misincorporation rates.For each reference position, the total number of reads that cover the given position were counted.Furthermore, we counted how many of those reads mismatch at the given position.The overall misincorporation rate per position in the clusters was computed by first adding up the mismatch counts and total coverage count for equivalent positions in the tRNA references within the given cluster, and then dividing the mismatch count by the total count.Equivalent positions between tRNA references within a cluster were defined by alignment against the tRNA RFAM alignment (RF0005).R T stop fraction: R T stops were analysed by computing the fraction of reads that start immediately after a given position.The RT stop fraction for each position was calculated dividing the number of reads that start at one nucleotide downstream of the position of interest by the total number of reads that cover this downstream position in the reference tRNA.The overall RT stop fraction of a position within a cluster was computed by summing all read-starts that map to one position downstream of the cluster position of interest and dividing by the number of reads that cover this downstream position in the cluster.m 5 C fractions: Putative m 5 C modification sites and m 5 C modification dynamics were detected based on the BS samples as C-retention rate.For single tRNA references, the Cretention rate was only computed for C positions and the count of reads that contain a C in the mapped position was divided by the count of mapped reads that contain a C or a T in the given position.To derive C-retention rates per cluster and per position, the count of mapped Cs was obtained as the number of reads with a C that is mapped to a reference C; the count of mapped Ts was obtained as the number of reads with a T that is mapped to either a reference C or T. The Cretention rate was computed as the count of Cs divided by the sum of the counted Cs and Ts.For interpretation details see also Supplementary Text in Supplementary File 1 .

Genome context analysis
tRNAs with a minimum mean fraction of 0.002 reads mapped at either the 1k-cell or 24 hpf stages were included in the genomic context analysis.Genomic locations were retrieved from GtRNAdb.The number of tRNAscan-SE predicted tRNA genes within a 10 kb range (5 kb upstream and 5 kb downstream) was counted for each tRNA gene.For each unique isodecoder, the mean of these counts over all identical gene copies was calculated.

Gene expression analysis
mRNA sequencing was performed using QuantSeq 3 mRNA-Seq Library Prep Kit (FWD) for Illumina (Lexogen).500 ng total RNA per sample were used.Single-indexed QuantSeq libraries were QC-checked on a Bioanalyzer 2100 (Agilent) using a High Sensitivity DNA Kit for correct insert size and quantified using Qubit dsDNA HS Assay (Invitrogen).Pooled libraries were sequenced on a NextSeq500 instrument (Illumina) in 1 × 75bp single-end sequencing mode.On average 8 million reads were generated per sample.Reads in fastq format were generated using the Illumina bcl2fastq command line tool (v2.19.1.403).Reads were trimmed and filtered using cutadapt (version 2.8) to trim polyA tails, remove reads with N's and trim bases with a quality of less than 30 from the 3 ends of the reads.On average, 7 million reads were left after this procedure.The composition of the reads after each trimming step was assessed using FASTQC (v0.11.9).Reads in fastq format left after the last trimming step were aligned to the Danio rerio (zebrafish) reference genome version GRCz11 with Ensembl 104 annotations using STAR aligner (version 2.6.1a) in 2-pass mode.Reads per gene were counted by STAR.Raw read counts were normalized and variance stabilizing transformation (vst) was done with DESeq2 (version 1.22.2).The mean vst-counts for known tRNA modifying enzymes (genes) per timepoint were plotted in a heat-map.Genes of interest were clustered and sorted corresponding to a UPGMA tree based on their similarity in expression levels as implemented in scipy as hierarchical linkage clustering method (scipy .cluster.hierarchy.linkagewith method = 'average' and metric = 'euclidean').

Codon frequency and correlation analysis
The codon frequency analysis was based on the same QuantSeq data as the gene expression analysis.Raw read counts were normalized by total number of mapped reads.Coding sequences were obtained from Ensembl ( https://ftp.ensembl.org/pub/ release-107/ fasta/ danio _ rerio/ cds/ ) ( 24 ).A representative gene isoform was selected based on primary isoform annotations in APPRIS 2023_01.v48database ( https: //appris.bioinfo.cnio.es, zebrafish assembly version GRCz11, gene dataset Ensembl 104) ( 25 ,26 ).For each genomic protein coding gene, the occurrences of each codon was counted and multiplied by the genes expression level (normalized read count).To obtain the codon frequency, those per codon counts were summed up over all genes and then divided by the total number of codons.
The correlation between tRNA abundance and the frequencies of the corresponding decoded codons was calculated using the Pearson correlation coefficient, based on anticodon families.tRNA abundance was determined by dividing the number of reads mapping to a reference within a specific genomic anticodon family by the total reads mapping to a genomic tRNA reference.The corresponding codon frequencies were obtained by summing the frequencies of the decoded codons for each anticodon.Pearson correlation coefficients and their associated P -values were computed for the tRNA abundance and codon frequencies at a specific time point, as well as for the time-course data of tRNA abundance and codon frequency for individual anticodon families.Multiple testing corrections on the P -values from the time-course correlations were performed using the Holm-Bonferroni method.

Nor thern blot ting
Approximately 2 μg of zebrafish total RNA were separated on a 15% denaturing polyacrylamide gel and imaged after GelRed staining.After pilot experiments, the amounts of total RNA for different samples were adjusted to have comparable amount of tRNA loaded per lane.The RNA was transferred to a Hybond-N+ Nylon transfer membrane (Cytiva) with a Trans-Blot SD blotting apparatus (BioRad) followed by crosslinking at 0.12 J / cm 2 in a CX-2000 UV crosslinker (Analytik Jena).The membrane was then rinsed with 6X SSC and prehybridized in 6 × SSC, 10 × Denhardt's solution and 0.5% SDS rotating at 40 or 42 • C. The following antisense oligonucleotides were 32 P-labelled with T4 PNK (NEB) and used as probes, hybridized at the indicated temperature overnight: anti-Asp-GUC-1 GCGGGGA T ACTT ACC (40 The next day the membrane was washed once for 15 min with 6 × SSC and 0.1% SDS and two times for 15 min in 4 × SSC and 0.1% SDS.The membrane was exposed to a Storage Phosphor Screen and signals acquired on a Typhoon scanner (Cytiva).Membranes were reused after stripping of the probe by incubation 3 times for 15 min at 75 • C in stripping solution (40 mM Tris-Cl pH 7.5, 0.1 × SSC and 1% SDS).Bands were quantified by densitometry with Image Lab 6.1 (Bio-Rad).The bands' intensity was normalized by the intensity of the tRNA band in the polyacrylamide gel stained with GelRed prior to blotting.

Primer-extension assay
Primer extension experiments to detect modifications were performed as previously described ( 27 ) with minor adjustments.The method is based on the synthesis of cDNA using a radioactively labelled primer that anneals a few nucleotides downstream from the modified RNA residue of interest.The primer is extended using an RT sensitive to modification, which results in chain terminations in correspondence of modifications (as opposed to the highly processive, modificationtolerant TGIRT which rather introduces a mismatch).The primer was annealed in a volume of 2.5 μl containing 1 pmol primer (PheGAA-A14 TTCAGTCT AA TGCTCTC-CCA; GluCUC-A58 TGGTTCCCTGACCGG), 0.5-1.0μg total RNA, 50 mM Tris-HCl pH 8.3, 60 mM NaCl and 10 mM DTT at 75 • C for 10 min followed by slowly cooling down to 23 • C. Then 2.5 μl containing dNTPs (10 mM), 10 × Primer extension buffer and 2 U AMV Reverse Transcriptase (Promega) were added to the annealing mixture and incubated for 1 hour at 42 • C. The reaction was stopped by adding 5 μl of 2 × RNA loading buffer.5 μl of reaction were separated on a 15% denaturing polyacrylamide gel.The gel was exposed to a Storage Phosphor Screen at -80 • C. The RT stop fraction for the modification assayed was determined by dividing the band intensity of the stop observed immediately at the 3 of the modification by the sum of the downstream bands intensity (consisting of stops due to other modifications or readthrough to the RNA end.All the bands were quantified with Image Lab 6.1 (Bio-Rad).

tRAM-seq: an optimized library preparation and analysis protocol for tRNA profiling
To analyse tRNA expression and modification during embryo development, we isolated RNA from activated zebrafish eggs, as well as from embryos at the 4-cell stage (approximately one hour post-fertilisation, 1 hpf), 1000-cell stage (1k-cell, 3 hpf), 5 hpf, bud stage (10 hpf), and 24 hpf ( Supplementary Figure S5 ); furthermore, we included samples of ovaries from adult female zebrafish.
For the preparation of sequencing libraries and their bioinformatic analysis, we built upon recent advancements in the use of NGS for similar purposes and devised an optimized tR NA a bundance and m odification seq uencing (tRAM-seq) protocol (Figure 1 ).The presence of RT-interfering modifications, the most abundant being methylations at the Watson-Crick face of the nucleobase ( 10 ), can result in underrepresentation of the most heavily modified tRNAs in NGS data.Similar to others ( 12 ,13 ), we used demethylation to minimize such bias and improve mapping, coverage, and quantification of the tRNAs.We used three variants of a demethylase of bacterial origin, AlkB, that were shown to have robust demethylase activity on ( 13 ,20 ,29 ).Thus, for each sample we isolated the tRNA fraction and split it in (i) a fraction not subjected to demethylation (mock sample) used for modification detection (see following sections), (ii) a fraction treated with the mixture of the three demethylases used for tRNA quantification (DM sample) and (iii) a fraction subjected to demethylation and bisulfite conversion for detection of m 5 C modification (BS sample) (Figure 1 A).
For library preparation, we combined protocols previously developed for small RNA sequencing ( 11 , 13 , 23 ), and integrated them in a workflow for analysing zebrafish tRNA (Figure 1B; see also Supplementary Text and Supplementary Figure S4 for a comparison with previous protocols).We ligated the end-repaired tRNA samples to a pre-adenylated adapter, which bears six randomized nucleotides at the 5 -end in order to minimize sequence-preference bias in the ligation step ( 30 ).It was previously shown that different RTs are differentially affected by the modifications and structure of the Library construction and computational w orkflo w of tRAM-seq.( A ) The tRNA fraction is isolated from total RNA and subsequently subject to treatments for demethylation (AlkB treatment), bisulfite conversion, and end repair (deacylation and T4 PNK treatment).( B ) The tRNA fraction is ligated to the pre-adenylated adapter (red) and reverse transcribed starting from the RT primer (green).Subsequently the cDNA is circularized and amplified from the RT primer sequence.The RT primer contains two internal spacers ( •) to prevent circular amplification.Both the adapter and RT primer contain random nucleotides (N) at the 5 -end to reduce ligation bias and remo v e PCR duplicates during computational analysis.( C ) Computational pipeline of tRAM-seq data analysis.
template RNA, in terms of both processivity and fidelity ( 31 ).The thermostable group II intron reverse transcriptase TGIRT ( 32 ), with reaction conditions improved in a recent report ( 11 ), was shown to be among the most processive enzymes: it is able to proceed past structure and modifications usually acting as roadblocks for reverse transcription, leaving a characteristic misincorporation signature at modified sites ( 31 ).We reverse-transcribed the adapter-ligated RNA with TGIRT, and then circularized the synthesized cDNA with the 5 -end of the RT primer, which also included three randomized nucleotides to minimize ligation bias.We amplified the obtained circular cDNA and sequenced the library by Illumina sequencing.The randomized nucleotides included in the library construction (6 in the ligated adapter and 3 in the RT primer) were used as unique molecular identifiers (UMIs) to remove PCR duplication products.
The presence of multiple, identical or nearly identical copies per tRNA gene requires optimized mapping strategies for the analysis of tRNA sequencing data.We and others have previously tackled the issue by using clustered references ( 11 ,33-35 ), where multiple identical or nearly identical tRNA genes are collapsed to one representative reference.However, previous work did not address the case of the model organism zebrafish, which poses a major challenge of its own, with currently 8676 predicted high confidence tRNA genes, and additional thousands of lower-scoring predictions ( 2 ).Noteworthy, additional sequence variation can also be introduced by the presence of SNPs within laboratory zebrafish populations.After preliminary analysis, we found that our NGS data included reads from tRNA genes that are not part of the high-scoring GtRNAdb set.Therefore, we opted to include all predicted ∼20 000 genes (including low-confidence predictions and pseudo genes) and performed mapping of the DM libraries on this initial set of tRNA candidate genes (Figure 1 C).All tRNA genes represented by at least 500 RPM were included in our reference for further analysis, and after further filtering (see Methods and Supplementary File 1 for details) the references were collapsed to 68 distinct clusters defined by sequence similarity and extent of multimapping reads (see methods and Supplementary Table S2 for details).The 68 clusters also included the 22 mitochondria-encoded tRNAs, which are all present as single copies in the organellar genome, corrected for SNPs that we identified in the sequencing data and that we confirmed by Sanger sequencing of amplified mitochondrial DNA (see Supplementary Text in Supplementary File 1 and Supplementary Figure S6 ).
All libraries were mapped to the refined, manually-curated, clustered reference, and coverage and error rate were calculated for every position in the tRNA cluster sequences.In Figure 2 , we show example coverage plots of the tRNA cluster Gln-CTG / TTG of a zebrafish ovary sample.The cluster includes reference tRNA genes differing at a few positions, namely canonical 34, 44, and 69, as indicated in the reference composition.Any other deviation from the expected sequences are flagged and highlighted.In the mock sample, the expected modifications m 1 G9 and m 1 A58 are clearly detected as a high RT error rate (Figure 2 A).In the DM sample, nearly all errors at those positions were abolished, and only low, background-level errors are visible throughout the tRNA sequence (Figure 2 B).The extent of full-length tRNA reads increased slightly in the DM versus the mock library, suggesting that, although our library preparation was efficient in the mock sample, the demethylation still improved the coverage toward the 5 -end of the tRNA cluster.The coverage plot for the BS library shows detection of m 5 C methylation at position 49 and 50 (Figure 2 C), as expected for these tRNAs in vertebrates ( 8 ).Only a low level background of C retention is visible throughout the tRNA cluster sequence, indicating efficient bisulfite conversion.Similar results were obtained for all 68 tRNA clusters ( Supplementary File 2, 3, 4 ), thus showing that our tRAM-seq protocol enables us to assess tRNA abundance and modification in an organism with a very complex tRNA repertoire like zebrafish.

tRNA isodecoder identity and abundance are dynamic during embryo development
To profile the levels of tRNAs during early embryonic development, we used the tRAM-seq coverage data obtained from the DM samples.As shown in Figure 3 A, we observed a wide range of expression levels for the different tRNA clusters, where tRNA-Asp-GTC was the most abundant, representing up to 10% of the total tRNA reads.Within the tRNA transcriptome, we observed diverse dynamics in tRNA levels across the samples analysed, with the expression of some tRNA clusters increasing or decreasing during embryogenesis, while other tRNA clusters appeared stable (Figure 3 A, B).Noteworthy, we observed that the amount of total tR-NAs as fraction of total embryo RNA (mostly consisting of rRNA) was highest in ovaries but lowest in activated eggs, and increased over the time course analysed until 24 hpf,  ( Supplementary Figure S7 A).The fraction of mitochondrial tRNA reads appeared to be highest in activated eggs and early embryonic stages, and decreased over time (Figure 3 A,B and Supplementary Figure S7 B).Consequently, the apparent decrease in mitochondrial tRNAs was reasonably due to the prevalent increase of nucleo-cytoplasmic tRNAs rather than an actual decrease of the mitochondria-encoded ones.Normalizing individual mitochondrial tRNAs to the mitochondrial pool instead of the total tRNA repertoire, we observed little changes in the relative abundance of individual organellar tRNAs ( Supplementary Figure S7 C).
By deconvoluting the tRNA clusters into individual tRNA isodecoders, we observed that in most instances a few tRNA isodecoders were the main representatives of each tRNA cluster.Remarkably, for a subset of tRNA clusters we observed an identity change between early and later developmental stages, with a substantial switch in expressed tRNA isodecoders taking place from 5 hpf onward, as exemplified by the clusters Asp-GTC, Leu-CAA and Thr-AGT / TGT / CGT (Figure 3 C).Similar results were observed for other tRNA clusters, although not for all clusters ( Supplementary File 5 ).
To validate our tRNA abundance results obtained by tRAM-seq, we chose northern blotting as orthogonal method.Northern blot relies on probe hydridization and is not affected by the same potential biases in ligation, RT, and PCR amplification that can affect NGS library preparation.In addition, we have designed our probes to target regions within the tRNA sequences which are devoid of modifications, aiming to prevent any bias due to differential modification extent.Northern blot experiments on selected, representative tRNA clusters recapitulated closely the changes observed by tRAM-seq (Figure 3 ,4 ).Remarkably, using isodecoder-specific probes and stringent hybridization / wash conditions, we apparently distinguished the isodecoders tRNA Asp-GTC-1 and tRNA Asp-GTC-2 , and confirmed their expression dynamics in the time course analysed (Figure 4 ).
Inspecting the genomic location of the tRNAs expressed at different stages, we observed that all isodecoders whose abundance increased in later stages are encoded by tRNA genes scattered across the genome (Figure 5 A, B); conversely, the genes of isodecoders present already in the eggs and early embryonal stages are encoded in multiple copies within highly repetitive regions (Figure 5 A, C).In the case of isodecoders expressed across all the time points analysed, we observed the existence of copies of the tRNA gene present in both genomic contexts, i.e. as isolated genes and in repetitive genomic regions (Figure 5 D).These observations suggest that the genomic context of tRNA genes might be linked to the specific isodecoder expression profile during different stages of embryo development.
To address whether the abundance of specific tRNAs matches the actual frequency of the corresponding codons in the transcriptome, we performed RNA-seq of the same zebrafish samples analysed by tRAM-seq, and calculated the frequency of each codon across the mRNA transcriptome ( Supplementary Figure S8 A).Comparing the codon frequency and the corresponding tRNA abundance (Figure 3 ) we found significant positive correlation at the individual developmental stages ( Supplementary Figure S8 B, C), but less so when comparing the dynamics over the different time points ( Supplementary Figure S8 D).In the case of the tRNA cluster Lys-CTT which is highly enriched toward 24 hpf (Figure 3 A), we found a consistent increase in the frequency of AAG codons at later time points ( Supplementary Figure S8 A-D).However, in other cases we observed no correlation or even negative correlation, for instance between tRNA Val-AAC and the cognate codons GTT / GTC ( Supplementary Figure S8 D).The correlation / anticorrelations were based on small changes of the relative codon frequencies over time, and when multiple comparison correction was factored in, no significant correlation was observed.Restricting the correlation analysis to the most highly expressed mRNAs, we observed virtually identical results (data not shown).Overall, the tRNA abundance dynamics during embryo development does not appear to strongly correlate with changes in the frequency of the corresponding codons ( Supplementary Figure S8 D).
tRNA modification landscape of the developing zebrafish embryo RNA modifications at the base moiety of nucleoside residues can interfere with RT, causing characteristic errors in NGS data ( 27 ,36 ).To detect tRNA modifications that induce RTerrors, we analysed the NGS results of the mock samples, i.e. libraries prepared from tRNA of zebrafish samples subject to neither enzymatic, nor chemical treatment to alter modifications before library preparation (Figure 1 ).In Figure 6 , we show the complete landscape of tRNA modifications detected in the early developing zebrafish embryo (4cell stage, 1 hpf) as inferred from the rate of mismatch in the reads spanning all 68 tRNA clusters (see also Supplementary Table S3 ).Based on available tRNA modification databases ( 8 , 37 , 38 ), we assigned the identity of the modified sites.We detected high RT error rates, and thus modification signature, at the following known methylation sites: m 1 A9 and m 1 G9 in many nucleo-cytoplasmic tRNA clusters and all mitochondrial tRNA clusters encoding a purine at position 9, m 1 A14 in Phe-GAA, m 3 C20 in elongator Met-CAT, m 2,2 G26 in many nucleo-cytoplasmic and mitochondrial tRNA clusters as well as m 2,2 G27 in Tyr-GTA, m 3 C32 in several nucleocytoplasmic clusters and 2 mitochondrial tRNAs, m 1 G37 in many tRNA clusters, m 3 Ce2 (also known as m 3 C47d) in the variable loop of Leu-CAG and all nucleo-cytoplasmic Ser clusters, and m 1 A58 in nearly all tRNA clusters (Figure 6 ).We conducted the same analysis on the sequencing data derived from the DM libraries, for which the tRNA fraction was pretreated by demethylation: all sites of modifications interpreted as m 1 A, m 1 G, m 3 C and m 2,2 G lost the modification signature after demethylation treatment ( Supplementary Figure S9 ), confirming the inferred identity.
We also detected a signature of A-to-I editing.Inosine is produced by adenosine deamination, and it base-pairs with cytosine during RT, causing incorporation of G in reads obtained by Sanger sequencing or NGS.At canonical position 34 of several tRNA clusters, we found prevalent guanosine incorporation in the reads in correspondence of an encoded A34 (Figure 6 and Supplementary File 2 ).The mismatch at A34 was not altered by demethylation in the DM samples ( Supplementary Figure S9 ), supporting the interpretation of the site as a case of editing.Another site of A-to-I editing is A37 in tRNA-Ala.In the mock samples, we detected a modification signature in Ala-AGC and Ala-TGC / CGC reminiscent of purine methylation, consisting of a combination of different mismatches ( Supplementary File 2 ).In the DM libraries, the high RT error rate at position A37 of both tRNA-Ala clusters was still present ( Supplementary Figure S9 ), but converted to nearly exclusively guanosine incorporation ( Supplementary File 3 ), confirming that the original modification found at position 37 of both tRNA-Ala clusters is 1-methylinosine (m 1 I).
In addition to simple base methylations and A-to-I editing, we detected additional modifications interfering with RT.One such RT-interfering modification is the bulky guanosine modification peroxywybutosine (o 2 yW), which is exclusively found at position 37 of tRNA-Phe in eukaryotes ( 8 ) and is expected to cause a major roadblock for RT.Indeed, at position 37 of cluster Phe-GAA we found a high RT error rate as well as a high fraction of read stops, indicative of prematurely terminated RT (Figure 6 and Supplementary Figure S10 ).Furthermore, we detected both mismatches and stops at position A37 of mt-Ser2-TGA and mt-Trp-TCA, where the modification 2-methylthio-N 6 -isopentenyladenosine (ms 2 i 6 A) was reported in human and bovine ( 39 ,40 ).
The region spanning canonical positions 20, 20a, and 20b in the D-loop showed modification signatures resistant to demethylation in several nucleo-cytoplasmic tRNA clusters (Figure 6 and Supplementary Figure S9 ), mainly consisting of misincorporations and to a lesser extent RT stops ( Supplementary Figure S10 ).Modifications known to be present in eukaryotes at these sites are 3-(3-amino-3-carboxypropyl)uridine (acp 3 U), pseudouridine ( ψ ), dihydrouridine (D), and combinations thereof ( 8 , 37 , 38 ).Besides in the D-loop, ψ is frequently found in the aminoacyl-acceptor stem, D-stem, anticodon stem-loop, variable loop, T-stem, and nearly in all tRNAs at canonical position 55; however, in none of those sites we observed a modification signature.Similarly, D is often found at canonical positions 16, 17 and but none of those positions showed modification signature in our tRAM-seq data either.Thus, we suggest that the modifications detected at positions 20, 20a and 20b correspond to acp 3 sites, being acp 3 U, acp 3 D, or acp 3 ψ .
Another tRNA modification interfering with RT is 2-methylthio-N 6 -threonylcarbamoyladenosine (ms 2 t 6 A), to date reported at position 37 of tRNA-Lys-TTT in bacteria and eukaryotes ( 8 ).In our clustering pipeline, tRNA-Lys-TTT clustered together with the suppressor tRNA-Sup-TTA.In the cluster Lys-TTT_Sup-TTA, we detected mainly RT stops ( Supplementary Figure S10 ) unaffected by demethylation ( Supplementary File 3 ), with reads terminating at posi- tion 38 and thus consistent with the presence of ms 2 t 6 A at position 37.
No modification signature (in the form of mismatches or RT stops) was detected at sites corresponding to known m 2 G, or consecutive dihydrouridine (DD) sites, which have been previously reported to mildly interfere with RT ( 27 ,41 ).This observation suggests that the highly processive reversetranscriptase TGIRT is not affected by these modifications, at least under the experimental conditions that we used.

tRNA m 5 C methylome of the developing zebrafish embryo
The C 5 -methylation of cytosine (m 5 C) is a widely conserved modification found in virtually all RNA classes in the three domains of life, and in higher eukaryotes is introduced by the methyltransferases of the NSUN family, and DNMT2 (also known as TRDMT1) ( 8 ).m 5 C was shown to be impor-tant for RNA folding and metabolism, and alterations in the expression / function of m 5 C-installing methyltransferases was linked to developmental defects and cancer (reviewed in ( 42) and ( 10)).However, previous studies on m 5 C methylation in embryo development have focused on mRNAs ( 43 ,44 ), and the elucidation of the tRNA m 5 C methylome, in particular in vertebrate embryo development, remained unexplored.
Using the BS libraries, we compiled the complete m 5 C methylome of nucleo-cytoplasmic and mitochondrial tRNAs in zebrafish (Figure 7 and Supplementary Table S3 ).We detected C retention (indicative of m 5 C methylation) in the region spanning canonical positions 48, 49 and 50 of many nucleo-cytoplasmic tRNA clusters, as well as in mt-Leu1-T AG, mt-Leu2-T AA, mt-Thr -TGT, and mt-Tyr -GTA.These positions, located at the junction between the variable loop and the T-stem, are known to be targets of the methyltransferase NSUN2, which has dual localization (nuclear and mitochondrial) ( 45 ,46 ).NSUN2 also methylates C34 of nucleocytoplasmic tRNA-Leu-CAA, which is then further modified to 2 -0 -methyl-5-formylcytidine (f 5 Cm) ( 47 ).After bisulfite treatment, f 5 Cm is not expected to be detected as C retention in NGS data like m 5 C, unless an additional protection or reduction step is included in the protocol ( 48 ).In the BS libraries, we detected modification signature in the form of C retention at C34 of the cluster Leu-CAA (Figure 7 , Supplementary File 4 ), suggesting that in zebrafish this site is either only partially formylated, or is further reduced to 5hydroxymethylcytosine (hm 5 C), which behaves like m 5 C in bisulfite treatment and sequencing ( 49 ).Similarly, C34 in mt-Met-CAT is known to be methylated to m 5 C by NSUN3 and then further modified to f 5 C, although the extent of formylation in human cells was disputed (reviewed in ( 50 )).We clearly detected C retention at C34 of mt-Met-CAT (Figure 7 ), again suggesting that either the site is only partially formylated, or it is further reduced to hm 5 C.
Finally, we detected signature of m 5 C methylation at C72 of Cys-GCA and all Thr clusters (Figure 7 ), which are known targets of NSUN6 ( 51), and at C38 of Asp-GTC and Gly-GCC / CCC, and to a lower extent of Val-AA C / TA C, which are known to be methylated by DNMT2 (52)(53)(54).

tRAM-seq identifies unexpected tRNA modifications in zebrafish tRNAs
Analysing the tRNA modification landscape in zebrafish, we found unanticipated differences compared to modifications previously described in eukaryotes.In humans, ms 2 i 6 A was reported at position 37 of mt-Phe-GAA and mt-Tyr-GTA ( 39 ); however, in Danio rerio both tRNAs encode a guanosine at position 37 ( 2 ).Using tRAM-seq, we detected misincorporation at both sites, which was abolished by demethylation (Figure 6 and Supplementary Figure S9 ), suggesting that in zebrafish G37 of mt-Phe-GAA and mt-Tyr-GTA is modified to m 1 G.
In another instance, we detected a signature of m 5 C at position 40 of GlyGCC / CCC, although only to an extent of about 50% (Figure 7 ).To the best of our knowledge, m 5 C40 was previously reported exclusively in archaea and yeast ( 10 ), thus we present first evidence of the existence of this modification in a higher eukaryote.
Finally, we observed a misincorporation signature at a U in the variable loop (position e2 / 47) in the two Ser-CGA clusters (Figure 6 and Supplementary File 2 ), which disappeared after demethylation ( Supplementary Figure S9 and Supplementary File 3 ).The only known base-methylation of the Watson-Crick face of U is 3-methyluracil (m 3 U), exclusively reported in rRNA ( 10 ).Bacterial AlkB can demethylate 3-methylthymine in DNA ( 55 ), and the human homolog of AlkB, FTO, was shown to demethylate m 3 U in ssRNA ( 56 ).Thus, we speculate that bacterial AlkB, or one of the mutant variants that we used, may be able to demethylate an m 3 U in tRNAs.However, whether tRNA-Ser-CGA in zebrafish really contains a m 3 U in the variable loop will require further validation.
Overall, with tRAM-seq analysis we elucidated the comprehensive modification landscape of a large array of tRNA modifications in the developing zebrafish embryo, highlighting similarities and differences with tRNA modifications described to date in subsets of tRNAs derived from various organisms.

tRNA modification is dynamic during embryo development
Comparing the modification signatures in tRNAs in the different samples collected and processed as mock, we observed diverse modification profiles, suggesting dynamic changes during development.In Figure 8 , we show selected, representative tRNA clusters and the extent of modification signature at different sites.Most modification sites appeared to be constitutively modified, as shown by a stable rate of misincorporation Figure 9. tRNA modification detection by primer extension.Primer extension analysis of ( A ) m 1 A14 in PheGAA and ( C ) m 1 A58 in GluCTC performed using AMV RT on total RNA extracted from zebrafish samples and a corresponding radiolabelled primer.In lane 1 only the primer was loaded; in lanes 2 and 3 RT products that were reverse transcribed from zebrafish ovary total RNA in presence of the indicated ddNTPs were loaded.Lanes 4-10 contain the RT products reverse transcribed from each sample.RT stops caused by modification and the full-length cDNA are annotated on the right side of the image.( B ) and ( D ): quantification of m 1 A14 in PheGAA and m 1 A58 in GluCTC detected as RT stops in (A) and (C), respectively.The RT stop fraction is quantified by dividing the modification-induced RT stop by all RT stops in the lane (excluding the unextended primer).
over the samples / developmental stages analysed, as exemplified by m 2,2 G26 in Ala-TGC / CGC, Leu-CAG, Phe-GAA, m 1 A58 in Leu-CAG and Phe-GAA, m 1 I37 in Ala-TGC / CGC, but also evident for many other modifications in different clusters (Figure 8 and Supplementary File 6 ).All modification sites in mitochondrial tRNAs appeared to be stable, as inferred from the rate of RT errors ( Supplementary File 6 ).However, a subset of modifications in nucleo-cytoplasmic tRNAs appeared to change during embryo development.For instance, misincorporation at position 20 in Ala-TGC / CGC, inferred as a site of acp 3 modification (see previous section), appeared to be lowest in the ovary samples, nearly double in activated eggs, remained stable at 1 hpf and 3 hpf, and decreased from 5 hpf onward (Figure 8 A).We observed highest modification signature in activated eggs and the earliest embryonal stages analysed for m 1 A58 in Ala-TGC / CGC and Glu-CTC, m 1 A9 in Asp-GTC, and m 1 A14 in Phe-GAA (Figure 8 ), but also for m 2,2 G26 in Arg-TCG, m 1 G9 in Glu-TTC / CTC, m 1 A58 in Gly-CCC, acp 3 U20 in Leu-TA G / AA G, m 1 G9 in Met-CAT, acp 3 U20 in Thr-CGT and Thr-TGT, m 2,2 G27 in Tyr-GTA, and to a lesser extent m 3 Ce2 in Ser-CGA ( Supplementary File 6 ).
The ovary samples showed a distinct modification profile, aligning neither with eggs and early embryo, nor with the later stages (Figure 8 and Supplementary File 6 ).This might reflect the content of oocyte precursors at different stages of maturation that may differ substantially from mature eggs, and / or their combination with the proportion of maternal (mature) tissue.
We validated the observed tRNA modification dynamics for two representative sites by primer extension ( 27 ).For both cases, m 1 A14 in Phe-GAA and m 1 A58 in Glu-CTC, the modification profile obtained by primer extension closely resembled the one observed by tRAM-seq, confirming the reproducibility of our results and the observed dynamics of tRNA methylation during embryo development (Figure 8 and 9 ).Also in the case of m 5 C modifications detected by bisulfite treatment, we observed dynamics across the analysed samples.While most sites displayed stable C retention, thus m 5 C modification extent, a subset of sites showed a modulated profile.Figure 10 shows the m 5 C profile of selected, representative tRNA clusters.Remarkably, all instances of m 5 C sites apparently modulated were at positions 40, 48, 49 and 50, which are all targets of the methyltransferase NSUN2 (Figure 10 and Supplementary File S7 ).However, not all NSUN2 target sites were modulated, and even adjacent target sites of NSUN2 within the same cluster were differentially modified, for instance in the case of positions 49 and 50 of Gln-CTG / TTG, and 48 and 49 of Phe-GAA (Figure 10 B and E).
To determine whether the expression of tRNA modification enzymes also changes during zebrafish embryo development, we referred to the list of all tRNA modification enzymes identified to date ( 57 ), and analysed their expression profile ( Supplementary Figure S11 ).The steady-state levels of the mRNAs of many tRNA modification enzymes appeared to change over the time course analysed ( Supplementary Figure S11 ), but did not appear to correlate with the abundance of the corresponding modification.For instance, the mRNA of the zebrafish homologue of TRMT5 appears to be more abundant in eggs, 1 hpf and 3 hpf, with levels declining later ( Supplementary Figure S11 ).TRMT5 is responsible for m 1 G37 and m 1 I37 methylation in many nucleocytoplasmic and mitochondrial tRNAs ( 8 ,58 ).According to our tRAM-seq analysis, m 1 G37 and m 1 I37 appeared to be substantially stable in all tRNA clusters where they were detected.In the case of TRMT10B, responsible for m 1 A9 in Asp-GTC ( 35 ), we observed a decrease of mRNA levels from 1 hpf to 10 hpf, and subsequent increase at 24 hpf ( Supplementary Figure S11 ).This profile is tentatively compatible with the decrease in m 1 A9 in Asp-GTC around 5-10 hpf and subsequent new increase at 24 hpf (Figure 8 B).The mRNA steady state levels of TRMT10A, responsible for m 1 G9 in many nucleocytoplasmic tRNAs ( 35 ), progressively declined from activated eggs until 5-10 hpf ( Supplementary Figure S11 ); however, the modification signature of m 1 G9 was substantially stable for all tRNAs target of TRMT10A, with the exception of Glu-TTC / CTC and Met-CAT, where it apparently decreased from 5 hpf onward ( Supplementary File 6 ).Also, the mRNA of CDKAL1, responsible for the modification of t 6 A (silent in tRAM-seq modification detection) to ms 2 t 6 A37 in tRNA Lys-TTT ( 59 ) appeared to increase over time from activated eggs to 24 hpf; however, the modification signature ms 2 t 6 A37 that we observed in the cluster Lys-TTT_Sup-TTA, consisting mainly of RT stops at position 37, was substantially unchanged ( Supplementary Figure S12 ).Similarly, the modification profile of further tRNA modifications did not match the apparent expression levels of the corresponding modification enzymes, as inferred from their mRNA levels ( Supplementary Figure S11 ).
In conclusion, here we show for the first time that tRNA modifications are dynamic during vertebrate embryo development.The molecular mechanism and determinants of the specific modification patterns cannot be attributed to the apparent expression levels of modification enzymes, and may entail more complex mechanisms affecting tRNA substrateselection and base-selection specificity.

Discussion
The apparent redundancy of tRNA genes is known since decades, and the expansion of the tRNA genes' copy number was suggested to be linked to the frequency of codons in the mRNAs of abundant proteins, at least in bacteria and yeast ( 60 ,61 ).The investigation of the actual (possibly differential) expression levels of tRNAs used to be limited to tedious hybridization-based methods like microarrays and northern blotting.Nevertheless, early work showed that in higher eukaryotes the pool of expressed tRNAs can vary between tissues, proliferative states, and between physiological and pathological conditions like cancer ( 62 ,63 ).
Here, we built upon recent developments in NGS-based methods for the study of tRNAs, and devised a wet lab and computational analysis pipeline, tRAM-seq, to investigate the dynamics of tRNA expression and modification during zebrafish embryo development.We combined a previously published protocol for tRNA sequencing using the highly processive R T TGIR T with improved reaction conditions ( 11 ) with the incorporation of randomized ligation termini aimed to minimize ligation bias ( 23 ), which we also used for PCR product deduplication.Also, similar to others ( 12 , 13 , 64 ), we used the demetylase AlkB and its mutant variants to demethylate samples specifically for tRNA abundance analysis, aiming to minimize any bias in representation of heavily-versus lightlymodified tRNAs.Noteworthy, we obtained sufficient coverage also for tRNAs carrying bulky modifications that are known to cause a hard stop to RT and that are not removed by AlkB, namely o 2 yW37 in Phe-GAA and ms 2 t 6 A37 in Lys-TTT ( 8 ).Furthermore, the comparison between the mock and DM samples allowed us to substantiate the inferred type of modification detected at specific positions.
In our analysis pipeline, we used the actual sequencing data to select and construct the best-matched reference to perform the final mapping of the tRNA reads.This approach proved effective for the analysis of D. rerio NGS data, which was not addressed by previous work probably due to the extreme expansion of tRNA genes' copy number in zebrafish (over 20000 including low-scoring predictions) ( 2 ).We further collapsed the still large number of tRNA genes found expressed in the DM samples into representative clusters, based on both sequence similarity and actual fraction of multimapping reads.The sequence differences in the clustered references are visualized in our coverage plots, adding a layer of information available in the analysis.Moreover, our mapping strategy did not make use of 'masking' known modified positions to accommodate mismatches at those sites ( 11 ), rendering our analysis independent of previous knowledge of modifications (which may be scarce for organisms other than the most investigated models).
Using tRAM-seq, we elucidated for the first time the dynamics of tRNA expression and modification during embryo development in the vertebrate model zebrafish.Asp-GTC appeared to be the most abundant tRNA, in particular at 10 hpf and 24 hpf, with up to 10% of the total tRNA reads mapping to the cluster Asp-GTC (Figure 3 A and B).Asp-GTC also displayed a strong modulation across the samples analysed, increasing from about 5% up to 10% of the total tRNA reads.The only previous study that quantified tRNA expression in zebrafish embryos, at the single time point of 6 hpf, similarly found Asp-GTC being the most abundant tRNA ( 65 ); however, the authors did not observe a high frequency of aspartateencoding codons in the corresponding transcriptome.Whilst we generally see correlation between tRNA abundance and codon frequency at the individual time points, we did not observe a significant correlation between the changes in abundance of tRNA-Asp or other tRNAs and the frequency of the corresponding codons in the time course analysed (Figure 3 and Supplementary Figure S8 ).Another abundant tRNA in our datasets was Lys-CTT: its abundance levels, in terms of fraction of total tRNA, nearly doubled from about 3% in activated eggs and 4-cell stage, to about 6% at 24 hpf (Figure 3 ).Lys-CTT was one of the few cases where we observed some correlation (although not statistically significant) between tRNA abundance and frequency of codons, as the frequency of the codon Lys-AAG similarly increased over the time course analysed ( Supplementary Figure S8 ), possibly suggesting that the increase in tRNA-Lys-CTT is needed to match the translational requirements of Lys-containing proteins.A direct correlation between tRNA availability and codon frequency has been early reported for bacteria and yeast ( 66 ,67 ) and is widely considered as determinant of translation efficiency and mRNA stability (reviewed in ( 68 )).However, experimental data in higher eukaryotes appeared to do not generally confirm a correlation between tRNA abundance and codon frequency ( 62 ).Even the postulated link between codon bias and translational efficiency was subject to controversy (69)(70)(71)(72).Part of the inconsistencies may have been due to the scarcity of reliable tRNA abundance data, which were mostly substituted for by tRNA gene copy numbers or Pol III occupancy as proxy ( 71 ).Remarkably, in the recent work by Gao et al., the authors found general correlation between tRNA abundance and codon frequency, but the changes in tRNA pools after differentiation did not correlate with the translational efficiency of the corresponding codons ( 73 ).It was previously suggested that the selection of non-optimal (rare) codons may play a role in coordinating the expression of specific subsets of genes ( 74 ,75 ).Thus, the changes in levels of tR-NAs may not directly reflect the codon frequency of the whole transcriptome (which appears to be substantially stable) but have a more complex impact on the rate of translation of subsets of mRNAs.The tuning of the available tRNA repertoire may have evolved together with the multiple mechanisms regulating translation to support the specific, rapidly changing expression program of the developing embryo (76)(77)(78).Alternatively, the relative abundance of different tRNAs may be linked to alternative tRNA functions, for instance in interaction with proteins, RNAs, or metabolic pathways ( 1 ).In the case of mitochondrial tRNAs, the apparent higher levels of mt-Val-TAC may be due to its incorporation as a subunit of the mitochondrial ribosome ( 79 ), and the observed increase from 5 hpf may be due to the new synthesis of organellar ribosomes.
When we dissected the tRNA clusters into individual isodecoders, we observed a major switch in prevalent tRNAs present in activated eggs and early embryos versus later stages.The switch in tRNA expression appears to happen from 5 hpf onward, which corresponds to the start of gastrulation, i.e. the massive migration of embryonal cells and establishing of distinct embryonal layers ( 18 ).The observed change in tRNA ensemble can be reasonably attributed to the replacement of the original, maternally deposited tRNAs present in the egg by zygote-expressed tRNAs.Whether the maternal tRNA decline is due to dilution by newly transcribed tRNAs, regular turnover, or active / selective degradation remains to be clarified.In any case, how the different tRNA genes scattered across the zebrafish genome are selectively activated for transcription is not known.The eukaryotic RNA polymerase responsible for tRNA transcription is Pol III, and it was recently shown that Pol III transcribes different tRNA loci in human induced pluripotent stem cells (hiPSCs) versus cardiomyocytes and neurons ( 73 ).The results of Gao and co-workers largely align with our findings, suggesting that the apparent change in isodecoder expression during development / differentiation is conserved in humans.However, a few differences can be observed.For example, Lys-CTT appeared to be stable or slightly decreased in cardiomyocytes or neurons, respectively, versus hiPSCs, whereas we observed an increase of Lys-CTT during embryo development (Figure 3 B,C).The discrepancies may (i) be due to tissue-specificity of tRNA reprogramming (as in cardiomyocytes or neurons vs. whole embryo), (ii) be linked to the differentiation protocol used and the degree of maturation achieved, or (iii) still reflect organism-specific differ-ences.Alternatively, differences may be due to the character of stable cells lines (pluripotent or terminally differentiated) and their selection / adaptation to cultivation condition, versus the dynamic, physiological process of zebrafish embryo development.
Still, how different tRNAs can be selectively expressed at different stages of embryo development remains to be clarified.In zebrafish, most tRNA genes are located within highly repetitive clusters, especially on the long arm of chromosome 4, in addition to isolated tRNA genes scattered across the entire genome ( 2 ,5 ).We have noted a striking pattern where isodecoders expressed only in later stages are present in single or few isolated copies across the genome, whereas bona fide maternally deposited tRNAs are encoded in highly repetitive regions (Figure 5 ).Similarly, maternal 5S rRNA was shown to be encoded in large clusters of repeats on the long arm of chromosome 4 ( 3 ).Chromosome 4 in zebrafish has the highest content of repetitive sequences, including many different families of transposable elements, is characterized by high degree of heterochromatinization, and was proposed to act as sex-determining chromosome (80)(81)(82).It is intriguing to speculate that the genomic context may contribute to determine which tRNA genes are expressed during gametogenesis versus later developmental stages, via chromatin remodelling or de-repression / repression of repetitive sequences.It is worth noting that the current tRNAscan-SE-based predictions of tRNA genes reported in GtRNAdb exclude in the so-called 'tertiary filter' tRNAs that have an excessive number of gene copies (labelled as 'NNN' ( 83)).Given our findings that some of those tRNAs appear to be present in eggs and early embryo ( Supplementary File 5 ), it may be advisable to reconsider repetitiveness as rational for excluding gene predictions from the high-confidence tRNA ensemble.
Similar to the analysis of tRNA expression, the study of (t)RNA modifications has lagged behind due to technical challenges.Early work relied on the use of radioactive labelling and separation by TLC ( 84 ), or the use of high-performance liquid chromatography and mass spectrometry ( 85 ).These methods require the isolation of large amounts of the specific RNA of interest to be analysed, and are severely limited in applicability to endogenous RNAs.The recently available NGS-based methods overcome this limitation allowing to investigate a broad range of modifications at transcriptomic level with relatively little amount of starting material ( 16 ).These new methods have thus opened the avenue for investigating the differential modification of tRNAs across tissues / samples, conditions, developmental stages, etc.It is worth noting that whilst A-to-I editing is consistently read as G in sequencing experiments, other modifications are detected as a combination of different misincorporations / stops, where also the correct base is incorporated to a certain extent during RT.The modification signatures observed, consisting of misincorporations and RT stops, cannot be directly translated into absolute values since the type and extent of RT errors depend on multiple factors, including the type of modification itself, the adjacent sequence, and the RT used ( 11 , 31 , 86 ).However, the percentage of error represents the minimum possible extent of modification, and it is reasonable to assume that high misincorporation rates correspond to complete or nearly complete modification.In any case, the relative comparison of individual modification sites between matched samples and / or time points (like in our case) remains valid as the type of modification and sequence context remain unaltered.
To date, the only studies of tRNAs and their modification profile during development were conducted in the amoeba Dictyostelium discoideum , which develops from single cell to a multicellular fruiting body in response to starvation ( 87 ), and in Drosophila ( 88 ), whilst no study addressed so far tRNA expression and modification during development in vertebrates.Here we profiled eleven different tRNA modifications, namely: I, m 1 I, m 1 G, m 1 A, m 2 2 G, m 3 C, m 5 C, acp 3 U, ms 2 i 6 A, ms 2 t 6 A, and o 2 yW.We showed that the extent of modification at a subset of tRNA positions changes during zebrafish embryo development, prevalently displaying higher modification signature in activated eggs and early developmental stages, and a decrease in modification after 5 hpf.These dynamics are unlikely to be generally due to changes in the abundance of the responsible modification enzymes, as we did not observe a correlation with their mRNA expression levels ( Supplementary Figure S11 ) or with their protein levels in similar samples that we recently reported ( 89 ).Furthermore, different sites modified by the same enzyme showed different modification profiles.In a few instances, changes of modification signature could be attributed to an actual change in abundance of different tRNAs within a cluster, of which only some can be modified.This is the case for I34 in Thr-A GT / CGT / TGT and Val-CA C / TA C / AA C, which are the only sites where we rather observed an increase in modification signature at 10 hpf and 24 hpf as compared to earlier stages ( Supplementary File 6 ).Here, the increase in modification signature is due to the relative increase of abundance of the Thr-AGT and Val-AAC isodecoders within the respective cluster (Figure 3 C and Supplementary File 5 ).In other instances, the different modification extent may be the consequence of differences in the activity and / or affinity of the modification enzyme for specific tRNA positions.For example, in the case of the cluster Ala-TGC / CGC, the isodecoder Ala-TGC-1 is prevalently expressed in early time points, and after 5 hpf appears to be replaced by Ala-TGC-4; in parallel, the overall extent of m 1 A58 methylation in the cluster appears to decline.Actually, the extent of m 1 A58 modification remains consistently high for Ala-TGC-1 and low for Ala-TGC-4; thus, the apparent lower modification of the cluster at later time points is due to the change in relative abundance of the different isodecoders.Ala-TGC-1 and Ala-TGC-4 differ in the variable loop by having U47 or C47, and in the T-stem where the base-pair 50-64 is U:A or C:G, respectively.Interestingly, the methylation of A58 by the TRMT61A-TRIMT6 complex requires conformational rearrangements of the Tand D-stem loops ( 90 ); thus, the identity of the bases and / or base-pairs present may affect the flexibility of the regions involved, and ultimately the efficiency of methylation.Given our observation of a change in isodecoder expression during embryo development, it is intriguing to hypothesize that differences in sequence (and possibly structure as consequence) between different isodecoders may act as determinants for -or affect recognition by-modification enzymes, entailing a coordinated modulation of isodecoder expression and modification.The remodelling of the tRNA transcriptome in terms of isodecoder expression and modification during embryo development holds the potential of fine tuning the function of tRNAs in protein translation and beyond.The full range of functional consequences of this phenomenon remains an open question.
C for 30 min in 70 mM Tris-HCl pH 9. Deacylation was omitted for bisulfite converted tRNA.Each tRNA sample was dephosphorylated by incubation at 37 • C for 45 min in a volume of 100 μl containing 1X PNK Reaction buffer, 10 U of T4 Polynucleotide Kinase (NEB) and 40 U Murine RNase Inhibitor (NEB), followed by PCI extraction and ethanol precipitation. 3 -adapter ligation: the tRNA pellets were redissolved and a pre-adenylated adapter (rApp-NNNNNNCTGTA GGCA CCA TCAA T-ddC) ( 23 ) was ligated to the 3 -end by incubation in a reaction volume of 20 μl, containing 20 pmol 3 -adapter, 1X T4 RNA ligase buffer, 25% PEG-8000, 200 U T4 RNA ligase 2 (truncated KQ, NEB) and 40 U Murine RNase Inhibitor (NEB) for 4 hours at 25 • C. The ligation products were separated on a 10% denaturing polyacrylamide gel and gel pieces spanning the range of about 55 nt to 300 nt were excised, crushed, and eluted over night at room temperature.The eluted tRNA was PCI extracted and precipitated.Reverse transcription (RT): the adapter-ligated tRNA pellets were redissolved and annealed in a volume of 12 μl to 2.5 pmol RT primer (p-NNNAGATCGGAAGAGCGT CGTGT AGGGAAAGAGTGTAGA TCTCGGTGGTCGC-Spc18-C ACTC A-Spc18-TTCA GA CGTGTGCTCTTCCGA TCTATTGATGGTGCCTA CA G) ( 23 ) by 2 min of incubation at 85 • C followed by 5 min of incubation at 25 • C. The reverse transcription was carried out in a reaction volume of 20 μl containing the annealed tRNA-RT primer, 500 nM TGIRT-III (InGex, or available from Dr A. Lambowitz, University of Texas, Austin), 1X Protoscript II buffer (50 mM Tris-HCl pH 8.3, 75 mM KCl, 3 mM MgCl 2 ) 40 U Murine

Figure 1 .
Figure1.Library construction and computational w orkflo w of tRAM-seq.( A ) The tRNA fraction is isolated from total RNA and subsequently subject to treatments for demethylation (AlkB treatment), bisulfite conversion, and end repair (deacylation and T4 PNK treatment).( B ) The tRNA fraction is ligated to the pre-adenylated adapter (red) and reverse transcribed starting from the RT primer (green).Subsequently the cDNA is circularized and amplified from the RT primer sequence.The RT primer contains two internal spacers ( •) to prevent circular amplification.Both the adapter and RT primer contain random nucleotides (N) at the 5 -end to reduce ligation bias and remo v e PCR duplicates during computational analysis.( C ) Computational pipeline of tRAM-seq data analysis.

Figure 2 .
Figure 2. Co v erage plots with modification signature.Example co v erage plots with reference nucleotide identity of the cluster Gln-CTG / TTG f or RNA isolated from zebrafish ovary and processed as ( A ) mock sample, ( B ) DM sample or ( C ) BS sample.Light colours indicate correctly mapped bases, dark colours highlight misincorporated bases, and grey highlights deletions.In ( C ), light orange colour represents C retention, indicative of m 5 C methylation.Numbers on the x-axis indicate canonical nucleotide position, and the amount of reads per million are shown on the y-axis.

Figure 3 .
Figure 3. tRNA abundance during zebrafish embryo development.( A ) Heat-map representing the normalized abundance of clustered nucleo-cytoplasmic and mitochondrial tRNAs (y-axis) in the analysed DM samples spanning zebrafish embryo development, plus ovaries (x-axis).The total number of reads mapped to the tRNA cluster were normalized by the total read count, and plotted per sample and time point.The blue colour scale indicates the fraction of reads mapped to the individual tRNA clusters.Data are means of three biological replicates with the e x ception of the activated eggs ( n = 2).( B ) Abundance dynamics of selected nucleo-cytoplasmic and mitochondrial tRNA clusters.The line represents the mean of the three biological replicates with the e x ception of the activated eggs ( n = 2).Replicates are indicated by square, circle and triangle.( C ) Plots showing abundance dynamics within tRNA clusters.Individual tRNA references are labelled and plotted in coloured lines if representing a minimum fraction of mapped reads of 0.002, the rest being plotted as grey lines.The line represents the mean of the biological replicates like in (B).

Figure 4 .
Figure 4. Quantification of tRNA abundance by northern blot.( A ) The expression of selected tRNAs was analysed by northern blotting (upper six panels).The denaturing polyacrylamide gel was stained by GelRed before blotting and imaged to be used as loading control (bottom panel).( B ) Quantification of tRNA abundance from northern blots (A) calibrated according to the GelRed-stained tRNA fraction (A, lower panel) and normalized to the ovary.

Figure 5 .
Figure 5. Genomic context of early and late expressed tRNAs. ( A ) B o xplot of the mean tRNA gene count in a 10 kb range of early and late expressed tRNA s. tRNA s increasing at least 4-fold in abundance in 24 hpf versus 1k-cell stage of the demethylated tRNA libraries are assigned to the 'late expressed' ( n = 31), while the 'early expressed' group ( n = 113) contains tRNAs that show a < 4-fold increase in abundance.Unique isodecoders were included when having at least a fraction of 0.002 of reads mapped in either the 1k-cell or 24 hpf stage.The box extends from the 25th to the 75th percentiles; the whiskers extend to 1.5 times the inter-quartile range from the hinge.The p-value was calculated using the Mann-Whitney U test.**** indicates P -value < 0.0 0 01 (B-D) Genomic context of representative tRNA genes.Black boxes represent predicted tRNA genes from the R epeatMask er track in the UCSC genome browser; gene names of tRNAs highlighted in blue are annotated on the right.Se v eral e xamples are sho wn f or early e xpressed ( B ) and late e xpressed ( C ) tRNAs; ( D ) examples of isodecoders that have identical gene copies predicted in both repetitive and isolated contexts.

Figure 6 .
Figure 6.tRNA modification landscape of 4-cell stage zebrafish embryos.Heat-map of mean misincorporation fraction of all nucleo-cytoplasmic tRNA clusters and mt-tRNAs (y-axis).Canonical nucleotide positions are annotated on the x-axis.Nucleotide positions that are rarely present in the tRNA clusters are annotated with a dot.blue colour scale indicates the mean mismatch fraction across three biological replicates.

Figure 7 .
Figure 7. tRNA m 5 C methylation landscape of 4-cell stage zebrafish embryos Heat-map of C retention, indicative for m 5 C methylation, for all nucleo-cytoplasmic tRNA clusters and mt-tRNAs (y-axis).Nucleotide positions that are rarely present in the tRNA clusters are annotated with a dot.The blue colour scale indicates the mean C retention fraction across two biological replicates.Some tRNA positions show C retention while most likely are not m 5 C methylated: the cluster MetCAT includes tRNAs that either have a C or a U at position 20.The tRNAs possessing a U20 are most likely modified to acp 3 U, which in our RT frequently leads to misincorporation of a G.This results in a C being mapped to a position that has a C as alternative reference, hence the C retention signal.Other likely artefacts are position 61, 62, 63 of the cluster GlyCCC that most likely are the result of incomplete bisulfite con v ersion due to str uct ure.Lastly , the C retention signal at the 5 end of the clusters mt-L y sTTT, mt-P roTGG and mt-V alTA C are most likely caused by over-trimming of leading Ts (see experimental details in Supplementary File 1 ).

Figure 8 .
Figure 8. tRNA modification dynamics during zebrafish embryo de v elopment.( A-E ) Misincorporation fraction in five selected tRNA clusters throughout de v elopment al st ages of z ebrafish embry os, plus o v ary (x-axis).Graphs are shown for tRNA nucleotide positions that have a misincorporation fraction equal or greater than 0.15 in at least one time point.The tRNA clusters are indicated on top of the plots; the predicted modifications causing the misincorporation are indicated in the legend on the right.Lines represent the mean of the three biological replicates, which are indicated by square, circle and triangle.

Figure 10 .
Figure 10.tRNA m 5 C methylation dynamics during zebrafish embryo de v elopment.( A-E ) C-retention fraction, indicative of m 5 C modification, of five selected tRNA clusters throughout developmental stages of z ebrafish embry os, plus o v ary (x-axis).Lines sho w the mean C-retention fraction of positions that ha v e a C-retention fraction equal or greater than 0.15 in at least one time point ( n = 3, e x cept f or activ ated eggs and 4-cell stage where n = 2).